Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging
Authors
Abstract
We consider d-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and so-called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate θ∗ ∈ R^d (that is, an optimum or a fixed point) using noisy data and O(d) updates per iteration. In this paper, we are motivated by the problem (in RL) of policy evaluation from experience replay using the temporal difference (TD) class of learning algorithms, which are also LSAs. For LSAs with a constant step-size and PR averaging, we provide bounds for the mean squared error (MSE) after t iterations. We assume that the data is i.i.d. with finite variance (with underlying distribution P) and that the expected dynamics is Hurwitz. For a given LSA with PR averaging and a data distribution P satisfying these assumptions, we show that there exists a range of constant step-sizes such that its MSE decays as O(1/t). We examine the conditions under which a constant step-size can be chosen uniformly for a class of data distributions P, and show that not all data distributions ‘admit’ such a uniform constant step-size. We also suggest a heuristic step-size tuning algorithm to choose a constant step-size of a given LSA for a given data distribution P. We compare our results with related work and discuss the implications of our results in the context of TD algorithms that are LSAs.
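To make the update concrete, below is a minimal sketch (in Python) of a constant step-size LSA with PR averaging on i.i.d. data. The problem instance, the names A_true and b_true, the step size, and the noise scales are illustrative assumptions, not the paper's setup: the iterate is driven by noisy samples (A_t, b_t) whose expected dynamics is Hurwitz, and a running PR average is maintained alongside it.

```python
import numpy as np

# Minimal sketch: constant step-size linear stochastic approximation (LSA)
# with Polyak-Ruppert (PR) iterate averaging. All names and constants are
# illustrative assumptions; the target is theta* solving A theta = b.

rng = np.random.default_rng(0)
d = 5
M = rng.normal(size=(d, d))
A_true = M @ M.T / d + np.eye(d)      # positive definite, so -A_true is Hurwitz
b_true = rng.normal(size=d)
theta_star = np.linalg.solve(A_true, b_true)   # the fixed point theta*

alpha = 0.05                          # constant step-size (small enough for stability)
theta = np.zeros(d)                   # LSA iterate
theta_bar = np.zeros(d)               # PR average of the iterates

T = 100_000
for t in range(1, T + 1):
    A_t = A_true + 0.1 * rng.normal(size=(d, d))  # i.i.d. noisy observation of A
    b_t = b_true + 0.1 * rng.normal(size=d)       # i.i.d. noisy observation of b
    theta = theta + alpha * (b_t - A_t @ theta)   # constant step-size LSA update
    theta_bar += (theta - theta_bar) / t          # running PR average of iterates

print("squared error, last iterate:", np.sum((theta - theta_star) ** 2))
print("squared error, PR average  :", np.sum((theta_bar - theta_star) ** 2))
```

The contrast this sketch is meant to exhibit is the one the abstract describes: with a constant step-size in the admissible range, the last iterate's error plateaus at a noise floor governed by α, while the PR average's MSE continues to decay at the O(1/t) rate.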
Similar resources
Iterate-averaging sign algorithms for adaptive filtering with applications to blind multiuser detection
Motivated by recent developments in iterate averaging of recursive stochastic approximation algorithms and in the asymptotic analysis of sign-error algorithms for adaptive filtering, this work develops two-stage sign algorithms for adaptive filtering. The proposed algorithms are based on constructing a sequence of estimates using large step sizes, followed by iterate averaging. Our main effort ...
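As a hedged illustration of the two-stage construction described in this snippet (a sketch in its spirit, not the authors' exact algorithm), the following runs sign-error LMS with a constant step size and then averages the iterates; the filter length, step size, and signal model are assumptions.

```python
import numpy as np

# Sketch of a two-stage sign-error scheme: stage 1 is sign-error LMS with a
# constant step size, stage 2 is iterate averaging. All constants and the
# signal model are illustrative assumptions.

rng = np.random.default_rng(1)
d = 8
w_true = rng.normal(size=d)            # unknown filter to be identified

mu = 0.02                              # constant step size for stage 1
w = np.zeros(d)
w_sum = np.zeros(d)

T = 50_000
for t in range(T):
    x = rng.normal(size=d)                      # input regressor
    y = w_true @ x + 0.1 * rng.normal()         # noisy desired output
    e = y - w @ x                               # prediction error
    w = w + mu * np.sign(e) * x                 # sign-error LMS update (stage 1)
    w_sum += w                                  # accumulate iterates for stage 2

w_avg = w_sum / T                      # stage 2: iterate-averaged estimate
print("error, raw iterate:", np.linalg.norm(w - w_true))
print("error, averaged   :", np.linalg.norm(w_avg - w_true))
```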
Rate of Convergence for Constrained Stochastic Approximation Algorithms
There is a large literature on the rate of convergence problem for general unconstrained stochastic approximations. Typically, one centers the iterate θ_n about the limit point θ̄ and then normalizes by dividing by the square root of the step size ε_n. Then some type of convergence in distribution or weak convergence of U_n, the centered and normalized iterate, is proved. For example, one proves that th...
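Written out (with symbols as reconstructed above, which is an assumption on my part), the centered and normalized quantity whose weak convergence is studied is:

```latex
% theta_n: iterate, \bar{\theta}: limit point, epsilon_n: step size.
U_n = \frac{\theta_n - \bar{\theta}}{\sqrt{\epsilon_n}}
```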
Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging
This work characterizes the benefits of averaging techniques widely used in conjunction with stochastic gradient descent (SGD). In particular, it sharply analyzes: (1) mini-batching, a method of averaging many samples of the gradient both to reduce the variance of a stochastic gradient estimate and to parallelize SGD, and (2) tail-averaging, a method involving averaging the final few i...
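The two techniques named in this snippet can be sketched together on a toy least-squares problem; the batch size, step size, and the choice to average the last half of the iterates are illustrative assumptions, not the paper's prescription.

```python
import numpy as np

# Sketch of mini-batched SGD with tail-averaging on least squares.
# Constants (batch size, step size, tail fraction) are illustrative.

rng = np.random.default_rng(2)
d, batch = 10, 16
w_true = rng.normal(size=d)

eta = 0.05                             # constant step size
w = np.zeros(d)
T = 20_000
tail_start = T // 2                    # tail-average over the last half
tail_sum = np.zeros(d)

for t in range(T):
    X = rng.normal(size=(batch, d))                # mini-batch of inputs
    y = X @ w_true + 0.1 * rng.normal(size=batch)  # noisy targets
    grad = X.T @ (X @ w - y) / batch               # averaged mini-batch gradient
    w = w - eta * grad                             # SGD step
    if t >= tail_start:
        tail_sum += w                              # accumulate tail iterates

w_tail = tail_sum / (T - tail_start)   # tail-averaged estimate
print("error, last iterate:", np.linalg.norm(w - w_true))
print("error, tail average:", np.linalg.norm(w_tail - w_true))
```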
Stochastic Recursive Inclusions with Non-Additive Iterate-Dependent Markov Noise
In this paper we study the asymptotic behavior of stochastic approximation schemes with set-valued drift function and non-additive iterate-dependent Markov noise. We show that a linearly interpolated trajectory of such a recursion is an asymptotic pseudotrajectory for the flow of a limiting differential inclusion obtained by averaging the set-valued drift function of the recursion w.r.t. the st...
Stochastic Recursive Inclusions in two timescales with non-additive iterate dependent Markov noise
In this paper we study the asymptotic behavior of a stochastic approximation scheme on two timescales with set-valued drift functions and in the presence of non-additive iterate-dependent Markov noise. It is shown that the recursion on each timescale tracks the flow of a differential inclusion obtained by averaging the set-valued drift function in the recursion with respect to a set of measures...
Journal title: CoRR
Volume: abs/1709.04073
Pages: -
Publication year: 2017